Executing Communication-Intensive Irregular Programs Efficiently
Authors
Abstract
We consider the problem of efficiently executing completely irregular, communication-intensive parallel programs. Completely irregular programs are those whose number of parallel threads as well as the amount of computation performed in each thread vary during execution. Our programs run on MIMD computers with some form of space-slicing (partitioning) and time-slicing (scheduling) support. A hardware barrier synchronization mechanism is required to efficiently implement the frequent communications of our programs, and this constrains the computer to a fixed-size partitioning policy. We compare the possible scheduling policies for irregular programs on fixed-size partitions: local scheduling and multi-gang scheduling, and prove that local scheduling does better. Then we introduce competitive analysis and formally analyze the online rebalancing algorithms required for efficient local scheduling under two scenarios: with full information and with partial information.
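For readers unfamiliar with competitive analysis, the usual textbook definition is sketched below. It is the standard formulation for a cost-minimizing online algorithm and may differ from the exact form used in the paper; the symbols ALG, OPT, sigma, c, and alpha are illustrative and do not come from the abstract.

```latex
% Standard definition of c-competitiveness for a cost-minimizing online algorithm.
% ALG, OPT, \sigma, c, and \alpha are illustrative symbols, not taken from the paper.
\[
  \mathrm{ALG}(\sigma) \;\le\; c \cdot \mathrm{OPT}(\sigma) + \alpha
  \qquad \text{for every input sequence } \sigma ,
\]
% ALG(\sigma): cost of the online algorithm, which sees \sigma one step at a time.
% OPT(\sigma): cost of an optimal offline algorithm that knows all of \sigma in advance.
% c \ge 1 is the competitive ratio; \alpha \ge 0 is a constant independent of \sigma.
```

Under this reading, an online rebalancing algorithm would be judged by how small a ratio c it can guarantee against an offline rebalancer that knows the whole execution in advance.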
Similar resources
Reducing Communication Cost for Parallelizing Irregular Scientific Codes
In most distributed-memory computations, node programs are executed on processors according to the owner-computes rule. However, the owner-computes rule is not best suited for irregular application codes. In such codes, the use of indirection in accessing the left-hand-side array makes it difficult to partition the loop iterations, and because of use of indirection in accessing ri...
A Hybrid Execution Model for Fine-Grained Languages on Distributed
While fine-grained concurrent languages can naturally capture concurrency in many irregular and dynamic problems, their flexibility has generally resulted in poor execution efficiency. In such languages the computation consists of many small threads which are created dynamically and synchronized implicitly. In order to minimize the overhead of these operations, we propose a hybrid execution model whi...
Executing multithreaded programs efficiently
This thesis presents the theory, design, and implementation of Cilk (pronounced “silk”) and Cilk-NOW. Cilk is a C-based language and portable runtime system for programming and executing multithreaded parallel programs. Cilk-NOW is an implementation of the Cilk runtime system that transparently manages resources for parallel programs running on a network of workstations. Cilk is built around a ...
A Hybrid Execution Model for Fine-Grained Languages
While fine-grained concurrent languages can naturally capture concurrency in many irregular and dynamic problems, their flexibility has generally resulted in poor execution efficiency. In such languages the computation consists of many small threads which are created dynamically and synchronized implicitly. In order to minimize the overhead of these operations, we propose a hybrid execution model whi...
Time-Shifted Modules: Exploiting Code Modularity for Fine Grain Parallelization
Multi-threaded processors and chip-multiprocessors execute concurrent threads in close physical proximity, potentially reducing the cost of synchronization and communication significantly and enabling the parallelization of programs at a fine grain. In this paper, we explore a source of fine-grain parallelism present in programs due to their modular nature. Concurrency is derived from executing...